Next-generation sequencing of host genetics risk factors associated with COVID-19 severity and long-COVID in Colombian population

Coronavirus disease 2019 (COVID-19) was considered a major public health burden worldwide. Multiple studies have shown that susceptibility to severe infections and the development of long-term symptoms are significantly influenced by viral and host factors. These findings have highlighted the potential of host genetic markers to identify high-risk individuals and develop targeted interventions to reduce morbidity and mortality. Despite their importance, host genetic factors remain largely understudied in Latin-American populations. Using a case–control design and a custom next-generation sequencing (NGS) panel encompassing 81 genetic variants and 74 genes previously associated with COVID-19 severity and long-COVID, we analyzed 56 individuals with asymptomatic or mild COVID-19 and 56 severe and critical cases. In agreement with previous studies, our results support the association between several clinical variables, including male sex, obesity and common symptoms like cough and dyspnea, and severe COVID-19. Remarkably, thirteen genetic variants showed an association with COVID-19 severity. Among these variants, rs11385942 (p < 0.01; OR = 10.88; 95% CI = 1.36–86.51) located in the LZTFL1 gene, and rs35775079 (p = 0.02; OR = 8.53; 95% CI = 1.05–69.45) located in CCR3 showed the strongest associations. Various respiratory and systemic symptoms, along with the rs8178521 variant (p < 0.01; OR = 2.51; 95% CI = 1.27–4.94) in the IL10RB gene, were significantly associated with the presence of long-COVID. The results of the predictive model comparison showed that the mixed model, which incorporates genetic and non-genetic variables, outperforms clinical and genetic models. To our knowledge, this is the first study in Colombia and Latin-America proposing a predictive model for COVID-19 severity and long-COVID based on genomic analysis. Our study highlights the usefulness of genomic approaches to studying host genetic risk factors in specific populations. The methodology used allowed us to validate several genetic variants previously associated with COVID-19 severity and long-COVID. Finally, the integrated model illustrates the importance of considering genetic factors in precision medicine of infectious diseases.


DNA extraction and custom NGS panel sequencing
Genomic DNA was extracted from peripheral blood samples using the Quick-DNA™ Miniprep Plus Kit (Zymo Research) and assessed for quantity and quality. Genomic DNA was quantified using a NanoDrop spectrophotometer. All samples were aliquoted and stored at 4 °C until analysis.
We performed targeted sequencing in 112 patients using a custom NGS panel. We considered two sets of target regions based on evidence reported in prospective cohorts, systematic reviews, meta-analyses, case-control analyses, GWAS and transcriptome-wide association studies (TWAS) 12. The first set of targets comprised candidate genes associated with COVID-19 severity and long-term complications. The second set of targets comprised candidate genetic variants associated with COVID-19 severity and long-term complications. In total, 74 genes and 81 genetic variants were selected for analysis (Supplementary Tables s1 and s2).
A total of 947 probes were designed using the SureDesign software, with an overall probe size of 214 bp. Hybrid capture-based enrichment of the target regions was performed using the SureSelect Custom Tier1 DNA Target Enrichment Probes (Agilent). Library preparation and capture were performed using the SureSelect XT HS2 Target Enrichment protocol (Agilent) and sequencing was performed on a DNBSEQ-G400 instrument (MGI). Enrichment, library preparation, capture and sequencing were performed by Gencell (Bogota D.C., Colombia).

Bioinformatic analysis
The quality of the raw FASTQ files was evaluated using FastQC software (v0.10.0) 80. Raw reads were trimmed to remove low-quality reads (< 80% Q30). Filtered reads were mapped to the human reference genome GRCh37/hg19 using the Burrows-Wheeler aligner (v0.17.17) and variants were called using the Sentieon software package (DNAseq 202010.02) 81,82. The Sentieon DNAseq software is a licensed workflow used to perform variant detection implementing GATK Best Practices. The critical steps of this workflow included mapping reads to the reference genome (GRCh37/hg19), duplicate marking, indel realignment, base quality score recalibration (BQSR) and variant calling. This workflow has demonstrated strong computational performance and accuracy compared to other pipelines, including GATK 82. The resulting Variant Call Format (VCF) files were annotated using the VarSeq software (Golden Helix) 83. Variants were filtered according to the following quality parameters: (1) FILTER = PASS, (2) QUAL ≥ 30, and (3) depth of coverage ≥ 10X. Variants had to fulfil all of these requirements to be included in the downstream analysis. Sequencing depth and coverage were assessed using the "bedcov" function in SAMtools (v1.12) 84.
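The three quality filters can be illustrated with a minimal Python sketch. It is not the VarSeq configuration used in the study: the function name, the example records and their coordinates are hypothetical, and depth is assumed to be read from the INFO/DP tag of each VCF record.

```python
def passes_quality_filters(vcf_line, min_qual=30.0, min_depth=10):
    """Return True if a VCF data line has FILTER = PASS, QUAL >= 30 and DP >= 10."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual, filt, info = fields[5], fields[6], fields[7]
    if filt != "PASS" or qual == ".":
        return False
    # Assumption: total depth is taken from the INFO/DP tag.
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    depth = info_map.get("DP")
    if depth is None:
        return False
    return float(qual) >= min_qual and int(depth) >= min_depth


# Hypothetical records (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).
example_lines = [
    "3\t1000\tvar_demo1\tG\tGA\t812.6\tPASS\tDP=950;AF=0.5",
    "3\t2000\tvar_demo2\tC\tT\t25.0\tPASS\tDP=400",       # fails QUAL
    "21\t3000\tvar_demo3\tA\tG\t500.0\tLowQual\tDP=300",  # fails FILTER
    "17\t4000\tvar_demo4\tT\tC\t99.0\tPASS\tDP=8",        # fails depth
]
kept = [line.split("\t")[2] for line in example_lines if passes_quality_filters(line)]
print(kept)
```

Only records meeting all three criteria are retained, mirroring the requirement that a variant fulfil every condition.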
Variant pathogenicity was classified using two approaches. First, we considered the molecular consequence of the variant, categorizing loss-of-function (LoF) variants (frameshift, nonsense, and canonical splice site) as pathogenic. Second, for missense variants, we used the Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants (REVEL) and classified as pathogenic those with a REVEL score > 0.5 85.
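The two-step rule can be written as a short decision function. This is a sketch only: the consequence labels and the function name are hypothetical, and real annotation output would need to be mapped onto them.

```python
# Hypothetical consequence labels for the LoF classes named in the text.
LOF_CONSEQUENCES = {"frameshift", "nonsense", "canonical_splice_site"}


def is_predicted_pathogenic(consequence, revel_score=None, revel_cutoff=0.5):
    """Apply the LoF rule first, then the REVEL cutoff for missense variants."""
    if consequence in LOF_CONSEQUENCES:
        return True
    if consequence == "missense" and revel_score is not None:
        return revel_score > revel_cutoff  # strictly greater than 0.5
    return False


print(is_predicted_pathogenic("frameshift"))
print(is_predicted_pathogenic("missense", revel_score=0.62))
print(is_predicted_pathogenic("missense", revel_score=0.31))
```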

Genetic analysis and linkage disequilibrium
We conducted two types of genetic analyses based on the set of targets. First, for the candidate variants, population genetic analyses including allelic frequencies, genotypic frequencies and Hardy-Weinberg equilibrium (HWE) were assessed using the SNPStats software 86. Deviation from HWE was established using a χ2 goodness-of-fit test with 1 degree of freedom (df). The bivariate association analysis between the candidate polymorphisms and COVID-19 severity or the presence of long-COVID was performed with the PLINK software (v1.9) 87. The association was evaluated under several genetic models (allelic, genotypic, dominant, and recessive) using the Cochran-Armitage trend, genotypic (2 df), dominant gene action (1 df), and recessive gene action (1 df) tests. The linkage disequilibrium (LD) between variants located on the same chromosome was determined using the D' value in Haploview (v4.2) 88.
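For readers unfamiliar with the HWE test, the χ2 goodness-of-fit statistic with 1 df can be computed from genotype counts as sketched below. This is an illustration in plain Python, not the SNPStats implementation; the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the first set is exactly in HWE, the second lacks heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

A small p-value indicates that the observed genotype counts deviate from the proportions expected under HWE.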
Second, for candidate genes, we implemented a bioinformatic filter to identify potentially pathogenic variants, as described above. For these variants, population genetic parameters were calculated, including allelic frequencies, genotypic frequencies and HWE.

Statistical analysis and predictive model
Descriptive analysis was performed for all variables. Frequency tables were generated for qualitative variables, whereas measures of central tendency and dispersion were calculated for quantitative variables. Normality was assessed with the Shapiro-Wilk test. Variables with a normal distribution were expressed as mean and standard deviation. Median, range, and upper and lower limits were reported for variables that did not follow a normal distribution.
A bivariate analysis was conducted to evaluate the association between clinical and host genetic factors and COVID-19 severity and the presence of long-COVID in cases and controls. Student's t and Mann-Whitney tests were used to compare quantitative variables, whilst the χ2 test was used to analyze qualitative independent variables.
For genetic variants, the bivariate analysis was performed based on the following genetic models: allelic (D vs d), dominant (DD, Dd vs dd), recessive (DD vs Dd, dd), and codominant (DD vs Dd vs dd), considering (D) as the major allele and (d) as the minor allele. The χ2 statistic was used with 1 degree of freedom for the dominant and recessive models and 2 degrees of freedom for the genotypic model. The Cochran-Armitage test was also incorporated for genetic variables that violated HWE. Odds ratios and their respective 95% confidence intervals were calculated for sociodemographic, clinical, and genetic variables.
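The genotype groupings and the odds ratio calculation can be sketched as follows. The text does not state how the confidence intervals were computed, so the Wald (log odds ratio) interval used here is an assumption, and the function names are hypothetical. The genotype counts are invented; the 2×2 example uses the long-COVID counts reported in the Results (44 of 56 cases and 22 of 56 controls).

```python
import math


def collapse_genotypes(counts, model):
    """Collapse (n_DD, n_Dd, n_dd) into two groups following the models in the text."""
    n_DD, n_Dd, n_dd = counts
    if model == "allelic":    # D vs d, counted as alleles
        return 2 * n_DD + n_Dd, 2 * n_dd + n_Dd
    if model == "dominant":   # DD, Dd vs dd
        return n_DD + n_Dd, n_dd
    if model == "recessive":  # DD vs Dd, dd
        return n_DD, n_Dd + n_dd
    raise ValueError(f"unknown model: {model}")


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] with a Wald 95% CI (assumed method)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts for one variant in the case group.
print(collapse_genotypes((30, 20, 6), "dominant"))

# Long-COVID present/absent: 44/12 in cases, 22/34 in controls.
# OR = (44 * 34) / (12 * 22), about 5.67.
or_, lo, hi = odds_ratio_wald_ci(44, 12, 22, 34)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level under this approximation.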
Statistically significant variables (p < 0.05) identified in the bivariate analysis were used to build the multivariate binary logistic regression model. The best model was selected using the backward stepwise method 89. The Wald test was used to evaluate the significance of the individual coefficients. Model assumptions were verified, including non-collinearity, homoscedasticity, and absence of error correlation. Model performance and goodness of fit were measured using the Hosmer-Lemeshow test, and the discriminatory capacity of the model was tested using the ROC curve. All data processing and analysis were done using the R language (v4.2), whilst PLINK was used for genetic risk modelling.

Clinical and demographic data
The total number of recruited patients was 144, with 77 classified as cases and 67 as controls. Out of these, 117 patients completed the clinical follow-up. Two patients and one family member requested voluntary withdrawal from the study, one patient had an incomplete diagnostic algorithm and another patient had insufficient DNA for analysis. Although two patients in the case group died, interviews were completed with the help of family members. In the end, the analysis was performed on 56 cases and 56 controls. A summary of the enrollment process is presented in Fig. 1.
Table 1 summarizes the clinical and demographic characteristics of our study sample. The median age was similar for both cases and controls, 48 years. Men were overrepresented in the case group, accounting for 62.5% (n = 35) of cases and 42.8% (n = 24) of controls. The most frequent comorbidities were diabetes mellitus, hypertension and obesity. Additionally, 53.6% of total patients did not have any comorbidity (n = 60), whilst 19.6% (n = 22) had 2 or more. The most common symptoms in both groups were fatigue 78.6% (n = 88), musculoskeletal pain 75.9% (n = 85), headache 67.9% (n = 76), and cough 67% (n = 75). The average symptom recovery time was 23 days (± 12) for cases and 19 (± 23) for controls (Supplementary Table s3).
Long-COVID was present in 78.5% of cases (n = 44) and 39.2% (n = 22) of controls. "Common signs and symptoms", including fatigue, headache, insomnia, odynophagia, hair loss, weight loss and diarrhea, were the most frequent findings in both groups with 41% (n = 46), followed by "neurological signs and symptoms", present in 33.9% (n = 38). Clinical and demographic characteristics of patients according to long-COVID status are detailed in Supplementary Table s4. The median age for long-COVID patients was 48 (21-60), whereas for patients without this sequela it was 45 (23-60). The phenomenon was more frequent in females (64.1%) than in males (54.2%). The prevalence of signs and symptoms in patients with and without long-COVID is presented in Supplementary Table s5.

Bioinformatic quality control
In total, we obtained 738.2 million reads, with an average of 6,591,339 reads per sample. For candidate variants, seven variants with a depth lower than 10X in more than 5% of the samples were removed (rs622568, rs1981555, rs7310667, rs11085727, rs13050728, rs113661667 and rs143334143). Genotypes of variants for patients with sequencing depth lower than 10X were designated as unknown (./.). The mean depth for the candidate variants was 996.4X (75.2X-2782.7X) (Supplementary Table s7). Regarding candidate genes, the target region spanned 179.9 Kbp and included the coding sequence and 50 bp of flanking intronic sequence per exon. Transcript selection and variant nomenclature were based on the principal transcript identified in Ensembl 90. Coverage above 20X was 99.05% and the average depth was 1205.9X (503.5X-1987.8X) for all the candidate genes (Supplementary Tables s8 and s9).
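The depth-based quality control for candidate variants combines two rules: drop a variant when more than 5% of samples fall below 10X, and otherwise mask the individual low-depth genotypes. A minimal sketch follows, assuming per-sample depths are available as plain lists; the function name, data layout and values are hypothetical.

```python
def mask_and_filter(genotypes, depths, min_depth=10, max_low_fraction=0.05):
    """Drop variants with too many low-depth samples; mask remaining low-depth calls.

    genotypes and depths map a variant ID to per-sample lists in the same order.
    """
    kept = {}
    for variant, calls in genotypes.items():
        low = [d < min_depth for d in depths[variant]]
        if sum(low) / len(low) > max_low_fraction:
            continue  # depth < 10X in more than 5% of samples: variant removed
        # Low-depth genotypes in retained variants become unknown (./.).
        kept[variant] = ["./." if is_low else gt for gt, is_low in zip(calls, low)]
    return kept


# Hypothetical data for 20 samples.
genotypes = {"var_a": ["0/1"] * 20, "var_b": ["0/0"] * 20}
depths = {
    "var_a": [50] * 19 + [5],     # 1 of 20 samples below 10X (5%): kept, one call masked
    "var_b": [50] * 18 + [5, 3],  # 2 of 20 samples below 10X (10%): removed
}
result = mask_and_filter(genotypes, depths)
print(sorted(result), result["var_a"][-1])
```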

Candidate variants analysis
Descriptive population genetic statistics for the candidate COVID-19 variants, including allelic and genotypic frequencies and HWE by case and control groups, are presented in Table 2. Similarly, a descriptive analysis by presence or absence of long-COVID is reported in Supplementary Table s10. Excluding variants rs41264915, rs2232354, rs147509469, rs4424872 and rs73510898, all SNVs were found to be in HWE (93.1%; n = 67).

Candidate genes analysis
A total of 291 variants were identified in the 74 candidate genes related to severe COVID-19 or long-COVID. After our filtering strategy, we obtained 65 variants, of which 69.2% (n = 45) were LoF variants and 30.8% (n = 20) were predicted pathogenic missense variants (REVEL score > 0.5). Regarding LoF variants, TLR3 was the gene harboring the highest number of variants (n = 9), followed by MUC1 (n = 7). All the other genes had fewer than 5 LoF variants. Likewise, FOXP4 was the gene accounting for the highest number of predicted pathogenic missense variants (n = 3), followed by DPP4 and FUT2 with 2 variants each (Table 3, Supplementary Table s11). LoF variant frequencies among the cases were slightly higher (n = 36) than in the controls (n = 30); nevertheless, this difference was not statistically significant (p = 0.33). Conversely, the number of predicted pathogenic missense variants in the control group was higher than in the case group (n = 34 vs n = 22), with a significant difference (p = 0.03). On the other hand, several genes such as TLR3 (n = 6), OAS3 (n = 2) and APOE (n = 1) presented LoF variants exclusively in the case group. Similarly, THBS3 (n = 1) and ATP11A (n = 1) harbored predicted pathogenic missense variants exclusively within the case group.
Extended information about potential pathogenic variants in candidate genes is presented in Supplementary [...] in patients from the control group with an allelic frequency of 8.93% and in patients belonging to the non-long-COVID group with a frequency of 8.70%. This variant showed a significant association with asymptomatic/mild COVID-19 (p = 0.02) and with the absence of long-COVID (p = 0.01) and has been previously associated with influenza susceptibility 61,91.

Predictive models
Genetic and clinical variables significantly associated with the outcomes of interest, severe COVID-19 and long-COVID, were incorporated into binary logistic regression models. Three different predictive models were built for each of our main outcomes: a clinical model, a genetic model, and a mixed model. The best model was selected according to the Akaike information criterion (AIC) using the backward stepwise method. These comparisons showed that the mixed models have the best discriminatory power, both for severity (AUC = 0.86; 95% CI = 0.78-0.93) and for long-COVID (AUC = 0.83; 95% CI = 0.74-0.91). A complete comparison of these models, including selected variables, is shown in Tables 4, 5, and Fig. 2. Quality and model assumptions were validated by identifying the absence of collinearity with the variance inflation factor test (< 1.2), homoscedasticity with the Breusch-Pagan test (p > 0.05), calibration with the Hosmer-Lemeshow test (p > 0.05), and error independence with the Durbin-Watson test (p > 0.05).
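To make the two model-comparison quantities concrete, the sketch below computes an AUC as the probability that a randomly chosen positive subject receives a higher score than a randomly chosen negative one, together with the AIC from a log-likelihood. It is plain Python for illustration, not the R code used in the study; the labels, scores and log-likelihood are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical outcomes (1 = severe) and predicted scores from a fitted model.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))

# Hypothetical log-likelihood for a model with 4 estimated parameters.
print(aic(-50.0, 4))
```

In backward stepwise selection, a variable is dropped when removing it lowers the AIC, and the procedure stops when no removal improves it.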
For the COVID-19 severity predictive mixed model, the variables included were sex, body mass index (BMI), presence of comorbidities, and the genetic variants rs2232354, rs11385942 and rs1819040, belonging to the genes IL1RN, LZTFL1 and KANSL1, respectively. The resulting predictive score is presented in Eq. 1.
COVID-19 severity predictive model (Eq. 1), where the adjusted score is a number between 0 and 1, "male" is male sex, "BMI" is body mass index, "comorb" is the presence of comorbidities and "WT/Alt" is the presence of the alternative allele for each genetic variant.
Long-COVID predictive model, where the adjusted score is a number between 0 and 1, "severe COVID-19" is the presence of severe disease, "anosmia", "fatigue" and "fever" refer to the presence of these symptoms and "WT/Alt" is the presence of the rs8178521 variant.
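As a generic illustration of how a binary logistic regression yields a score between 0 and 1, the severity model with the variables listed above would take the following form. The coefficients β0 to β6 are placeholders, not the fitted values, and the variable names simply follow the legend.

```latex
\[
\text{score} = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_0 + \beta_1\,\mathrm{male} + \beta_2\,\mathrm{BMI} + \beta_3\,\mathrm{comorb}
  + \beta_4\,\mathrm{rs2232354} + \beta_5\,\mathrm{rs11385942} + \beta_6\,\mathrm{rs1819040}
\]
```

Each genetic term is coded by the presence of the alternative allele (WT/Alt), so a positive coefficient raises the predicted probability of the outcome and a negative one lowers it.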

Table 3. Number of patients with potential pathogenic variants per gene according to COVID-19 severity. Column groups: Total; LoF variants (Cases, Controls); Predicted pathogenic missense variants (Cases, Controls).

Discussion
In just a matter of months, SARS-CoV-2 emerged as one of the most critical public health emergencies of the twenty-first century. Despite substantial progress in the understanding of this disease, the significant phenotypic variation in host responses and outcomes has not been fully elucidated 2,3,29. This variability is influenced by several factors, encompassing viral and host-related characteristics. Host genetic factors constitute important risk factors for COVID-19 severity, mortality, and the presence of sequelae. It is important to note that these genetic factors have remained understudied in Latin-American countries. In this study, we aimed to characterize clinical and host genetic factors related to disease severity and long-COVID development in a sample of the Colombian population. We identified multiple genetic and non-genetic risk factors associated with these outcomes. Furthermore, we incorporated these factors into two predictive models for our outcomes: disease severity and long-COVID. This study illustrates the potential usefulness of a combined strategy using clinical and genomic data to identify high-risk individuals in a specific population. Several non-genetic factors have demonstrated a substantial association with severe COVID-19 disease, including male sex, advanced age, and the presence of various comorbidities 92. Our study supports such findings, revealing a significant association between male sex, obesity, and diabetes mellitus with more adverse outcomes. There is growing evidence suggesting that comorbidities play a role in the development of endothelial damage, promoting a prothrombotic and inflammatory status and higher viral replication, ultimately exacerbating clinical outcomes 93,94. Regarding the clinical manifestations of the disease, respiratory and systemic signs and symptoms, including dyspnea, cough, odynophagia, fever, and fatigue, have shown a significant association with severe COVID-19 cases 95. This association can be attributed to the immune-cytopathic effect of the virus on lung tissue. This effect leads to a systemic proinflammatory response and widespread viral dissemination, which, in turn, exacerbates symptoms through multiorgan involvement 96. Interestingly, our study identified anosmia as a protective factor for severe disease, an observation previously made in other studies 97.
Regarding non-genetic factors and long-COVID, we found that severe COVID-19 is associated with a higher prevalence of long-COVID, as previously reported 98. We did not find any additional statistically significant associations between long-COVID and either presymptomatic clinical or demographic variables, contrary to what has been reported in other studies. Previous research has indicated that patients over 50 years old and those with multiple comorbidities are more likely to experience long-COVID 99. We believe that these discrepancies are due to differences in the methodological design, as these conclusions have been mostly based on considerably older patients 100,101. On the other hand, clinical manifestations during the acute phase of the disease, including respiratory and systemic signs and symptoms, show a correlation with long-COVID, in agreement with previous reports 102. This finding suggests that these acute-phase symptoms might serve as indicators of vascular, pulmonary, and central nervous system damage 99.
To date, it has been recognized that the response to COVID-19 infection is influenced by host genetic factors. Evidence from a study of twins, for instance, suggests a 50% heritability of COVID-19 risk 103. Given the implication of these factors, several initiatives have been developed to identify risk variants and genes associated with COVID-19 severity and mortality. The methods include GWAS, whole exome sequencing, whole genome sequencing, and case-control associations 14,43. Importantly, several authors have highlighted the limitations of these studies concerning the small number of variants or genes assessed and the underrepresentation of Latin-American populations. To the best of our knowledge, our study is the first to incorporate a custom NGS technique to evaluate host genetic factors contributing to both COVID-19 severity and long-COVID within a Latin-American sample.
This study identified 13 genetic variants associated with COVID-19 severity. Several of these variants, mainly located in the critical loci 3p21.31 and 17q21.31, have been described as important risk factors. In agreement with previous GWAS, we found that rs11385942, an intron variant located in LZTFL1, shows the strongest association (p < 0.01; OR = 10.88) with severe or critical COVID-19 43,63,104. These findings support the utility of this risk allele as a molecular prognostic biomarker in diverse populations. Conversely, we identified rs1819040, a variant located in KANSL1, as a protective allele against severe or critical disease (p = 0.03; OR = 0.37), as previously reported in other studies 43. This variant was found in linkage disequilibrium with two intronic variants, rs62054835 and rs112572874, located in MAPT-AS1 and MAPT, respectively. Transcriptome-wide association studies, GWAS, and eQTL studies have suggested the role of MAPT as a susceptibility gene for severe COVID-19 48. Indeed, genetic variants within MAPT have been related to autoimmune diseases, normal lung function, and interstitial lung disease 105,106. Additionally, we found a significant association between severe COVID-19 and rs35775079, a variant located in the intronic region of CCR3 (p = 0.02; OR = 8.53). CCR3 encodes a chemokine receptor highly expressed in eosinophils, basophils, TH1 and TH2 CD4+ T cells, and airway epithelial cells 107. This receptor is an important mediator of allergic responses, and genetic mouse model studies have demonstrated its crucial role in airway inflammatory cell infiltration 107,108. It has been proposed that variants in this gene may impact the disease outcome through an excessive inflammatory response, one of the hallmarks of severe COVID-19, as well as of other severe respiratory virus infections 109,110.
Currently, long-COVID symptoms are recognized as common sequelae of COVID-19 and represent a crucial focus of ongoing research. Similar to other reports, our research identifies an incidence of long-COVID of approximately 80% among severe COVID-19 patients and 40% among non-severe patients 111. Remarkably, we identified 4 genetic variants associated with this clinical condition. The variant with the strongest association, rs8178521, is located within the IL10RB gene (p = 0.01; OR = 2.51). This variant has been previously linked to COVID-19 severity 45. However, our study represents the first report suggesting its potential association with long-COVID. IL10RB encodes a receptor of type III interferons and plays a pivotal role in immunomodulation through its regulation of IL-10, influencing the differentiation, proliferation, and cytokine production of mast cells 112. Moreover, recent reports have suggested that the deregulated release of inflammatory mediators by mast cells is one of the potential mechanisms underlying the development of long-COVID 113,114.
Our study identified 70 potential deleterious rare variants in candidate genes associated with the pathogenesis and immune response against SARS-CoV-2 infection. Rare and low-frequency variants have been shown to contribute to COVID-19 and other immune-related complex disorders 115,116. However, despite these associations, our study did not find any significant difference in variant frequency within our study sample. This lack of significance could be attributed to limitations in the sample size of patients included in our study. Intriguingly, some of the LoF variants identified in this study were exclusively present in patients with severe or critical COVID-19. TLR3, for example, harbored 9 LoF variants in the case group compared to 0 among the controls. Other genetic variants in TLR3, such as rs3775291, have been related to an impairment in the immune response and associated with COVID-19 susceptibility and mortality 117. Given the protective role of TLR3 and its function in innate immunity during SARS-CoV-2 infection, other potentially deleterious variants could similarly influence COVID-19 clinical outcomes. Likewise, we identified a potential deleterious missense variant, UGT2A1 c.576T>A (rs111696697), exclusively in patients with long-COVID (allele frequency 0.75). This gene is expressed in the olfactory epithelium and encodes a member of the UDP-glycosyltransferase family which plays an important role as an odorant-metabolizing enzyme 118. Furthermore, UGT2A1/UGT2A2 has been associated with COVID-19 anosmia, one of the most frequent long-COVID symptoms 70. It should be highlighted that although some clinical and paraclinical predictors of long-COVID have been identified, the genetic factors related to this condition remain largely unknown. Identifying such factors could help to illuminate the biological and molecular basis of this disease.
In addition to host genetic variants, numerous studies have highlighted the role of viral genetic factors in COVID-19 pathogenicity, infectivity, and outcomes 119,120. The appearance of variants of concern (VOC) and variants of interest (VOI), in particular, has been continuously monitored and evaluated since the beginning of the epidemic 121,122. Although our study did not examine viral genetic factors, genomic surveillance studies conducted during the sample collection period (December 2020-July 2021) in Bogotá indicated that the predominant variants were B.1.621 (Mu) 57.3% (469/819), P.1 (Gamma) 14% (114/819), and B.1.1.7 (Alpha) 2.8% (23/819) 22. Therefore, the variant most frequently detected during this period was Mu. This variant was later classified as a variant being monitored (VBM) by the U.S. Centers for Disease Control and Prevention (CDC), and there were no reports of significant effects of this variant on infectivity, transmissibility, or severity, in contrast to VOIs.
While complex viral and host genetic interactions cannot be discarded, we consider that patients in the two groups can be compared since they were enrolled during the same period, when the previously mentioned viral variants were circulating. On the other hand, although Bogota is a large city, with an area of 1636 km², the Hospital Universitario Mayor-Mederi and the private laboratory Genética Molecular de Colombia, where cases and controls were enrolled, respectively, are just 8 km away from each other. Also, it should be highlighted that controls were recruited from a different location than cases, given that Colombian healthcare policies advised against attending hospitals for mild COVID-19 symptoms. As a result, there were limited options to include mild cases from hospital settings. As depicted, the hybrid models combining both clinical and host genetic variables constitute strong and reliable tools to predict COVID-19 outcomes. The biological basis of clinical variables has been discussed in previous models and reviews 123-125. On the other hand, recent studies have integrated specific genetic variants into predictive models 38,126. It is to be noted that the inclusion of variants from the IL1RN and KANSL1 genes in our model represents a novel approach. The absence of these variants in previous models may reflect differences in the genetic background of the studied populations and the complexity of the genetic architecture underlying COVID-19 outcomes. Thus, this study suggests that such a multivariable approach is a useful and innovative tool to identify high-risk individuals and prioritize limited health resources. We believe that such approaches are consistent with genomic and personalized medicine initiatives and may be useful for future pandemics.

Conclusions
This study analyzed the association between genetic and non-genetic factors with COVID-19 severity and the presence of long-COVID in a sample of the Colombian population. We found an association between these two outcomes and several genetic and non-genetic factors. The risk genetic variants are located in genes whose products participate in immunological signaling and humoral response against microorganisms. We highlight the usefulness of combining clinical and genomic data to develop models to predict COVID-19 response. Applying these predictive models in the clinical setting can help to identify high-risk individuals and focus resources and actions to reduce morbidity and mortality.

Limitations
Among the limitations of this study, we should mention that although the sample size might be sufficient to identify genetic variants with a medium or large effect, it may have been underpowered to detect the association of low-effect variants. Second, the sample size was calculated based on the available information on allele frequency. Third, we noticed that after the custom panel was designed and the probes were synthesized, novel candidate variants and genes were described in the literature. These were not included in this study, which highlights the importance of periodically updating custom NGS panels with clinical applications. On the other hand, although we took several measures to reduce potential bias, it may have been introduced during the interviews or clinical data collection. Finally, we should underline that the proposed models were not validated in a larger cohort; thus, more studies will be necessary to evaluate their accuracy and precision.

Figure 1. Enrollment process. The illustration depicts the process of enrollment, clinical follow-up and patient losses.

Table 1. Clinical and demographic characteristics of the studied population. *Statistically significant, p-value < 0.05; COPD, chronic obstructive pulmonary disease; BMI, body mass index; Pack year, index that measures the amount smoked over a time period; CI, confidence interval; OR, odds ratio; RT-PCR, reverse transcription polymerase chain reaction. † Variable that does not follow a normal distribution; its median was calculated (Sl, superior limit; Il, inferior limit).

Table 4. Comparison of clinical, genetic and mixed models for COVID-19 severity. AIC, Akaike information criterion; CI, confidence interval; OR, odds ratio; SD, standard deviation.

Table 5. Comparison of clinical, genetic and mixed models for long-COVID. Note: AIC, Akaike information criterion; CI, confidence interval; OR, odds ratio; SD, standard deviation.

Figure 2. Predictive model ROC curves. Comparison of receiver operating characteristic (ROC) curves derived from the different predictive models. ROC curves for clinical, genetic, and mixed predictive models for COVID-19 severity (A) and long-COVID (B).
associations between long COVID and either presymptomatic clinical or demographic variables, contrary to what has been reported in other studies. Previous research has indicated that patients over 50 years old and those with multiple comorbidities are more likely to experience long-COVID